ABSTRACT

replikativ is a replication middleware supporting a new
kind of confluent replicated datatype resembling a distributed
version control system. It retains the order of write operations
at the trade-off of reduced availability with after-the-fact
conflict resolution. The system allows developers to build
applications with distributed state in a similar fashion to native
applications with exclusive local state, while transparently
exposing the necessary compromises in terms of the CAP
theorem. In this paper, we give a specification of the replicated
datatype and discuss its usage in the replikativ middleware.
Experiments with the implementation show the
feasibility of the concept as a foundation for replication as a
service (RaaS).
1. INTRODUCTION

While building scalable distributed systems, developers
are typically confronted with a number of (potentially
conflicting) requirements:

1. Never lose data!

2. Always be available! Even when offline.

3. Provide a simple API, preferably with explicit consistency
semantics^1, e.g. DVCS-like, and not a distributed
file system without history, such as Dropbox. Keep a
sequential and consistent log of all modifications.

4. Support cross-platform serialization while offering strong
and extensible data semantics. Do not tie users to a
single platform, e.g. JSON and JavaScript.

5. Replicate everything at once consistently: code, data
types and referenced binary values.

6. Do not require configuration of complex backend storage
for simple applications.

7. Avoid ad-hoc reimplementation of network code with
every application and framework.

^1 A good summary from the developer of our
browser-based database can be found at
http://tonsky.me/blog/the-web-after-tomorrow/, and a
similar perspective at
http://writings.quilt.org/2014/05/12/distributed-systems-and-the-end-of-the-api/.
Both links retrieved on 2015-07-18.

In our replication system, replikativ,^2 we combine a
number of technologies to meet these requirements. The
main idea behind replikativ is to decouple the replication
of data from the application code, so that different
applications can share the same data base without mandatory
agreement on how the data is managed. This makes it possible to fork
application state and innovate with new applications inside
existing user bases if the data is shared openly. Recent
advances in machine learning techniques, e.g. deep learning
[1], highlight that access to large amounts of data can unlock
new insights into different aspects of the involved processes
and allow smarter services to evolve. As of today, the
notion of open source is often applied to code, while data is
considered the property of individual providers and only
evaluated in their own interests. The greatest potential of
building shared data and knowledge bases is as yet largely
untapped.

To provide users with sovereignty over their data,
replikativ will support a public-private key encryption system,
and our design already reflects that. It does not rely on the
often false security assumption of a safe internal zone versus
the internet and instead will encrypt the data end-to-end
and not only the communication channels.

In this paper, we focus on one technical core component of
replikativ. We designed the system around a new datatype,
named CDVCS, which decouples the conflict resolution
mechanisms that are typically hard-wired in convergent
replicated datatypes (CRDTs) [8]. The contribution of this paper
is the documentation of this new datatype and its combination
with different conflict resolution strategies. Originally
inspired by [6], we use CDVCS to implement the important
concepts of a distributed version control system (DVCS) to
retain convergence and scalability in our replication system.
Furthermore, we have generalized replikativ to allow novel
combinations of CDVCS with arbitrary CRDTs by snapshot
isolation. This provides the developer with flexible design
choices for different trade-offs between consistency and
scalability of write operations.

^2 For a more detailed description, have a look at the
documentation at https://github.com/replikativ/replikativ.

2. RELATED WORK 

The design of CDVCS trades high availability for weaker 
consistency guarantees. It is motivated by two major lines 
of work: distributed version control systems (DVCSs) and 
confluent replicated data types (CRDTs). For a general 
overview of consistency conditions and terminology, we refer 
to the survey of Dziuma et al. [4]. 

2.1 DVCS

Today’s workflows for light-weight and open-source friendly
software development are centered around distributed version
control systems (DVCSs), such as git, mercurial or
darcs. Updates and modifications can be executed offline
on the developer’s local replica(s) and are synchronized
explicitly by her to the shared code base. In terms of the
CAP theorem [5], these systems provide availability, but allow
divergence between different replicas. To reconcile the
system state, some after-the-fact conflict resolution has to
be applied, e.g. through 3-way-merging mechanisms on text
files or conflict markers in case of non-mergeable differences.
These conflicts then have to be resolved manually. To support
the user, DVCSs provide a commit history which makes it possible
to determine the order of events and detect when consistency
has been broken by concurrent writes. While this
technique has proven very effective for source code, attempts
to transfer these systems to data have had limited success
so far. Prominent examples are file systems that have been
built on top of git, such as gitfs^3 or git-annex^4. There
have also been repeated attempts at using git directly to
implement a database. Data management systems built on
top of off-the-shelf DVCS exhibit a number of problems:

1. Programs can exploit their text-oriented conflict resolution
scheme by encoding the data in text format,
e.g. in JSON. However, this requires serialization in
line-based text files in a filesystem structure to be compatible
with the default delta resolution mechanism for
automatic conflict resolution. When the diffing of text
files is customized in any of these DVCSs, usually a
complete reimplementation of operations becomes necessary,
and the desired compatibility is lost. Instead of
relying on textual representation, we believe that providing
customized data types with principled conflict
resolution schemes is a more natural approach [9].

2. File systems are the historic data storage model for a
non-distributed low-level binary view on data within
a single hierarchy (folders), and hence cannot capture
and exploit higher-level structure of data to model and
resolve conflicts. Today, the preferred way to manage
state from an application developer perspective is often
a relational model or language-specific data structures,
as they are declarative and allow developers to focus on data
instead of file system implementation details.

3. DVCSs often do not scale when it comes to handling
binary blobs, as these take part in the underlying
delta calculation step. For example, git then needs
an out-of-band replication mechanism like git-annex
to compensate, adding complexity to the
replication scheme.

^3 https://github.com/presslabs/gitfs
^4 https://git-annex.branchable.com/

We think that these attempts based on DVCSs, while being
close to our work, are doomed to fail due to the trade-offs
captured by the CAP theorem. They try to generalize
a highly optimized workflow for a manual, low-frequency
write workload of development on source code files to the
fast-evolving, high-frequency write workloads of state transitions
in databases. Much better trade-offs can be achieved by
picking the important properties of a DVCS and composing
it with other highly available data types. This approach
allows building scalable, write-workload oriented data types
at the application level.

By building on the CDVCS, we can use other, more efficient
confluent datatypes for write-intensive parts of the
global state space, e.g. posts in a social network and indexes
on hashtags. A DVCS introduces considerable overhead and
potential loss of availability on these operations.

2.2 Confluent replicated datatypes (CRDTs) 

While the original motivation for our system was to implement
a DVCS-like repository system for an ACID database
in an open and partitioned environment of online and offline
web clients and servers, a replication mechanism was
lacking. DVCS systems like git track only local branches
and do not allow propagation of conflicts and hence have no
conflict-free replication protocol. Conflicts can show up in
any part of the network topology of replicas during propagation
of updates and they can only be resolved manually
at this position. Since the system has to stay available and
needs to continue to replicate at scale while being failure-resistant,
we decided to build on prior work on convergent
replicated datatypes [8]. CRDTs fulfill our requirements as
they neither allow nor need any central coordination for
replication. They also provide a formalism to specify the
operations on the datatype and prove that the state of each
replica always progresses along a semi-lattice towards global
convergence. CRDTs have found application e.g. in Riak^5
or soundcloud^6 to allow merging of the network state after
arbitrary partitions without loss of write operations. This is
achieved by application of so-called downstream operations
on the respective local state of the CRDT. These operations
propagate as messages through the network. While this fits
our needs for the replication concept, it does not provide
semantics for strong consistency on sequential operations.

The notion of a CRDT in general implies automatic mergeability
of different replicas and does not lead to conflicts
which then would need some centralized information to be
resolved. Hence, they are usually referred to as conflict-free
replicated datatypes. Our datatype somewhat breaks with
this strong notion by merging conflicts, emerging as branch
heads, from the replication mechanism into the value of the
datatype. This allows resolution of conflicts at any point in
the future on any replica. CRDTs so far have mostly captured
operations on sets, counters, last-writer-wins registers
(LWWR), connected graphs and domain-specific datatypes,
e.g. for text editing [8]. None of these datatypes makes it possible to
consistently order distributed writes. Other CRDTs nonetheless
have benefits compared to our CDVCS datatype, because
they cause less overhead on replication and do not
require conflict resolution with reduced availability at the
application level, provided concurrency of the datatype operations
is acceptable. We hence generalized our replication
with a CRDT interface and reformulated our datatype together
with an OR-set in terms of this interface.

^5 http://basho.com/tag/crdt/
^6 https://github.com/soundcloud/roshi
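To make the idea of a conflict-free merge concrete, the following is a minimal Python sketch of a state-based observed-remove set (OR-set). It is an illustration only: the class name `ORSet`, its methods and the use of random UUID tags are assumptions made for this example and do not reflect the CRDT interface of replikativ.

```python
import uuid


class ORSet:
    """Minimal state-based observed-remove set (illustrative only)."""

    def __init__(self):
        self.adds = set()      # (element, unique tag) pairs
        self.removes = set()   # pairs that were observed and then removed

    def add(self, element):
        # Every add gets a fresh tag, so a later add is never shadowed
        # by an earlier remove.
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element):
        # Only tags observed locally are removed; concurrent adds survive.
        self.removes |= {pair for pair in self.adds if pair[0] == element}

    def value(self):
        return {element for (element, _tag) in self.adds - self.removes}

    def merge(self, other):
        # Set union is commutative, associative and idempotent, so replicas
        # converge regardless of the order in which states are exchanged.
        self.adds |= other.adds
        self.removes |= other.removes


a, b = ORSet(), ORSet()
a.add("post-1")
b.merge(a)
b.remove("post-1")   # b removes the tag it has observed
a.add("post-1")      # a concurrently re-adds the element with a fresh tag
a.merge(b)
b.merge(a)
```

After both merges the two replicas hold the same value without any coordination, which is the property that makes such datatypes attractive for write-intensive state, at the cost of not ordering the writes.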

Similarly to CRDTs, cloud datatypes [2] build on commutativity
of update operations. The design still happens from
a cloud operator’s perspective, though, as their flush operation
allows explicit synchronisation with some central view
on the data on a cloud server. All their non-synchronized
datatypes can be implemented with commutative CRDTs.

Close to our work are versionable, branchable and mergeable
datatypes [7]. This work models datatypes with an
object-oriented approach as a composition of CRDT-like
commutative datatype primitives (e.g. sets). To resolve conflicts,
each application needs to instantiate custom datatypes
which resolve conflicts at the application level. Therefore,
the code for conflict resolution has to be provided consistently
to each peer participating in replication. Having general
data types and compositions thereof, in contrast, allows
us to replicate without knowledge of the application and to
upgrade the replication software of the CRDTs more gradually,
independent of application release cycles. It also means
that all peers can participate in the replication no matter
whether they have been assigned to an application or not.

swarm.js^7 is the closest to our work. It employs op-based
CRDTs for client replication and runs in the browser,
allowing efficient offline applications^8. In contrast, replikativ
uses a dual representation of a CRDT, state-based in
memory and op-based at runtime during operations. This
in-memory representation makes it possible to store an efficient local
compression of the operation history which is straightforward
to implement for each CRDT and does not leak into
the replication of operations. Further, swarm.js has not
been designed as an open replication system. It uses a spanning
tree to minimize the replication latency of ops, while
we build on a gossip-like protocol, as building self-stabilizing
spanning trees over the internet is still an open area of research
[3]. Our peer protocol can be easily extended by middleware
systems concerning just a single connection without
dependencies on the code base. To our knowledge, swarm.js
lacks a mechanism to exchange external values, most importantly
(large) binary values. Our system references values
by their platform-independent hash, so datatypes only
need to carry 32 bytes for every transaction. The referenced
values need to be transmitted as well, of course, but can be
structurally shared between datatypes and even commits.
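As a rough illustration of referencing values by hash, the following Python sketch keeps values in a content-addressed map so that a commit only needs to carry a 32-byte digest. The choice of SHA-256 over canonical JSON and the names `content_hash`, `ValueStore`, `put` and `get` are assumptions made for this example; the hashing scheme actually used by replikativ may differ.

```python
import hashlib
import json


def content_hash(value) -> bytes:
    """Return a 32-byte digest of a JSON-serializable value.

    Assumption: canonical JSON (sorted keys, no extra whitespace) stands in
    for a platform-independent serialization.
    """
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()


class ValueStore:
    """Hypothetical content-addressed store shared by all datatypes."""

    def __init__(self):
        self._values = {}

    def put(self, value) -> bytes:
        key = content_hash(value)
        self._values[key] = value  # identical values are stored only once
        return key

    def get(self, key: bytes):
        return self._values[key]


store = ValueStore()
ref_a = store.put({"title": "lunch", "time": "13:00"})
ref_b = store.put({"time": "13:00", "title": "lunch"})  # same value, other key order
```

Because both calls yield the same reference, the value is stored and transmitted once and can be shared structurally between commits.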

3. APPLICATION: SHARED CALENDAR 

replikativ provides essential middleware functionality
for implementing distributed applications, covering client
communication, durable data storage, and consistency management.
Let us sketch how application developers can employ
this functionality.

As a basic example, consider a calendar application where
people store their appointments and synchronize them with
others. In the context of this paper, we simplify the calendar
application by tracking only titles of appointments and
their time. Each appointment is tracked as a branch. Let us
assume that Alice and Bob want to use a shared calendar to
synchronize on a lunch appointment, alongside the otherwise
private appointment branches as shown in Figure 1. Alice
has to work at 2 pm, therefore she wants to eat lunch earlier
at 1 pm. Bob has soccer practice at 3 pm, so he prefers
to have lunch later, at 2 pm. Once their clients are connected,
both transmit their concurrent operations to each
other. This causes a conflict because they have set different
times for their lunch. The application now notifies Alice
and Bob to resolve the conflict. Alice merges both commits
following a user-moderated consistency scenario (section 5.3).
The operation is then transmitted to Bob’s client and also
applied there.

^7 https://github.com/gritzko/swarm
^8 http://swarmjs.github.io/articles/2of5/

4. CDVCS DATATYPE 

We compose a CRDT satisfying the requirements from
Footnote 1 by implementing a DVCS with the primitives
available from CRDTs. Our consistency requirement for
an ACID transaction log demands a sequential history. In
a DVCS, this is captured by an add-only, monotonic DAG of
commits which represent identities, i.e. values changing in
time. The graph is monotonically growing and can be readily
implemented as a CRDT [8]. To track the identities in
the branch, we need to point to their heads in the graph.

In a downstream update operation to a branch with head
a, e.g. one reflecting a commit b, the branch heads are now
{a,b}. This is resolved in a DVCS by a lowest common ancestor
search (LCA). Whenever we want to resolve a branch
value, i.e. its history, we need to remove all stale ancestors
and either have only one head or a conflict of multiple ones.
We therefore remove stale ancestors in the set of heads on
downstream operations, so we do not need to use a CRDT
for the branch heads.

Combining the DAG, the sets and LCA completes our 
CRDT which we refer to as confluent DVCS or CDVCS. 
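The role of the LCA search can be sketched in a few lines of Python. This is a simplified, offline variant that first collects all ancestors of both commits; the online search used in the system is not reproduced here. The graph encoding (a dict from commit id to its list of parent ids) and the names `reachable`, `lca`, `lcas`, `visited_a` and `visited_b` are assumptions made for this example.

```python
def reachable(graph, commit):
    """Return `commit` together with all of its ancestors."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen


def lca(graph, a, b):
    """Return the lowest common ancestors of `a` and `b` and, per side,
    the commits visited on the way down to them."""
    up_a, up_b = reachable(graph, a), reachable(graph, b)
    common = up_a & up_b
    # Commits that are proper ancestors of some common ancestor are not lowest.
    covered = set()
    for commit in common:
        covered |= reachable(graph, commit) - {commit}
    return {
        "lcas": common - covered,
        "visited_a": up_a - covered,
        "visited_b": up_b - covered,
    }


# Root r, then a; b and c are concurrent children of a.
graph = {"r": [], "a": ["r"], "b": ["a"], "c": ["a"]}
concurrent = lca(graph, "b", "c")   # both heads stay: a conflict
stale = lca(graph, "a", "b")        # the LCA is a itself, so head a is stale
```

In this sketch a head is stale exactly when the search returns that head as the only lowest common ancestor, which is the check needed to prune the set of branch heads.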

Correctness. 

To show that CDVCS behaves properly as a CRDT, we have
to show that all operations satisfy the invariants of its metadata.
In particular, the graph must never lose nodes or edges
and must always grow according to the operation. All branch
heads must always point to leaves of the commit graph and
may only be removed if they are non-leaves (ancestors) of
one of the others. For (operation-based) CRDTs, operations
are split into upstream and downstream operations, where the
former are applied to the local state of a replica, leaving
the state unmodified, and the latter manipulate
the state and are also used to propagate the changes to the
other replicas.

Figure 1: States and operations side-by-side representing
dynamic processes of an example calendar
application where the CDVCS tracks the complete
application state. Each CDVCS has multiple
branches that describe an appointment. Both
instances of the CDVCSs have a branch which is
shared and one which is private to each CDVCS.
Conflicts can only arise in shared branches.

Figure 2: The state of two repositories illustrating
typical operations. Commits are represented as circles
with black colored head commits. Both repositories
have ’1’ as initial commit, one shared and
one private branch each. While Branch2 exists before
Branch3, Repository2 pulls all missing commits
into a new branch from Repository1. Furthermore,
by having two branch-heads in ’4’ and ’5’, a merge
into ’6’ is applied based on some consistency scenario.
The grey nodes represent the commit history
from the head node ’10’ of Branch4 up to the root.

CRDT specification.

The correctness of CDVCS heavily relies on LCA, which is
used in a typical DVCS to resolve conflicts. We use an online
LCA version which returns a set of common ancestors and
the subgraphs traversed to reach the ancestor(s) from each
commit, which we refer to by visited. We cover the following
operations:

• commithistory: Linearizes the history back to the
root from some commit, e.g. head of a branch, and
loads all commit-values from memory as can be seen
in Figure 2 on the right. It can be used to calculate
the current state of the application incrementally. It
also covers the branch history by providing the commit
history of the current branch head.

• commit: Commits a new value to a branch. Since it 
just carries the edge and node added to the commit 
graph and the single new branch head in a set, the 
downstream operation will ensure that it is applied 
correctly. 

• branch: Creates a new branch given a parent. This
operation forks off new branches directly at a commit
without creating a new one. Since it just adds a
new branch-id and initial head, the branch is correctly
set up.

• pull: Pulling adds all missing parent commits to the 
graph and adds the selected head into a set for the 
branch as can be seen in Figure 2. 

• merge: Merge resolves a conflict between multiple 
branch heads H' by adding a new commit with H' 
as parents as can be seen in Figure 2 or Figure 1. 

• downstream: All operations only carry additions to
the graph and sets of branch heads. We just have
to apply all additions to the graph and merge the
sets of heads. Since LCA properly detects all ancestral
heads, we can calculate the currently active heads
safely by pairwise comparison. A special case is the
initial full-state replication. Here the unknown part of
the remote state is fetched and added to the local state
by following the same procedure, which is also correct
in this case. All dependencies are always fetched before
atomic application, so the peer is in a self-consistent
state and can act as a data provider for other peers if
it is used with full replication.
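The commithistory query from the list above can be sketched as a depth-first linearization of the commit graph. The sketch assumes the graph is a Python dict from commit id to its list of parent ids; the function name `commit_history` and the small example graph are made up for illustration and are not taken from the implementation.

```python
def commit_history(graph, head):
    """Linearize the history of `head` so that parents precede children."""
    order, visited = [], set()
    stack = [(head, False)]
    while stack:
        commit, expanded = stack.pop()
        if expanded:
            # All ancestors of this commit have already been emitted.
            order.append(commit)
        elif commit not in visited:
            visited.add(commit)
            stack.append((commit, True))
            # Reverse so that the first parent is expanded first.
            for parent in reversed(graph[commit]):
                stack.append((parent, False))
    return order


# A merge commit "6" with the two former heads "4" and "5" as parents.
graph = {"1": [], "2": ["1"], "4": ["2"], "5": ["2"], "6": ["4", "5"]}
history = commit_history(graph, "6")
```

Folding the commit values over such a list in order is one way an application could rebuild its current state incrementally.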



















payload graph C, set H    (C: commit graph; H: branch heads)
    initial {r []}, {r}
query commithistory (graph C, commit c) : L
    S ← emptystack()
    S.push(c)
    let L = topological-sort(C, [], S, {})
update commit (commit c)
    prepare (c)
        let ΔC = c → [p]
        let ΔH = {c}
    effect (ΔC, ΔH)
        C ← C ∪ ΔC
        H ← removeancestors(H ∪ ΔH)
update pull (graph C', commit c)
    prepare (c)
        if |H| = 1 then
            let h = H.pop()
            let ΔC = lca(C, h, C', c).visited_c
            let ΔH = {c}
    effect (ΔC, ΔH)
        C ← C ∪ ΔC
        H ← removeancestors(H ∪ ΔH)
update mergebranches (vector of commits H')
    prepare (c)
        let ΔC = c → H'
        let ΔH = {c}
    effect (ΔC, ΔH)
        C ← C ∪ ΔC
        H ← removeancestors(H ∪ ΔH)
merge (S)
    C ← C ∪ S.C
    H ← removeancestors(H ∪ S.H)

Figure 3: CDVCS: A DVCS-like datatype. Here we 
just model one branch to unclutter notation. Every 
DVCS with multiple branches can be represented 
by multiple clones of the same DVCS. [...] denotes a 
vector of elements. 
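A compact Python rendering of the single-branch specification in Figure 3 may help to see how the pieces fit together. It is a sketch under several assumptions: commit ids are plain strings instead of hashes, stale heads are found by a full ancestor traversal instead of the online LCA, and `commit` refuses to run on a conflicting branch. The class and method names are chosen for this example only.

```python
class CDVCS:
    """Single-branch CDVCS state: commit graph plus set of branch heads."""

    def __init__(self):
        self.graph = {"r": []}   # commit id -> list of parent ids
        self.heads = {"r"}

    def _ancestors(self, commit):
        seen, stack = set(), list(self.graph[commit])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.graph[current])
        return seen

    def _remove_ancestors(self, heads):
        stale = set()
        for head in heads:
            stale |= self._ancestors(head) & heads
        return heads - stale

    def apply(self, delta_graph, delta_heads):
        """Downstream effect: grow the graph, then prune stale heads."""
        self.graph.update(delta_graph)
        self.heads = self._remove_ancestors(self.heads | set(delta_heads))

    def commit(self, commit_id):
        """Commit on top of the single current head."""
        if len(self.heads) != 1:
            raise ValueError("branch has conflicting heads; merge first")
        (parent,) = self.heads
        self.apply({commit_id: [parent]}, {commit_id})

    def merge_branches(self, commit_id):
        """Resolve a conflict by adding a commit with all heads as parents."""
        self.apply({commit_id: sorted(self.heads)}, {commit_id})

    def merge(self, other):
        """State-based merge with a remote replica."""
        self.apply(other.graph, other.heads)


# The lunch appointment from section 3, replayed on two replicas.
alice, bob = CDVCS(), CDVCS()
alice.commit("lunch-1pm")
bob.commit("lunch-2pm")
alice.merge(bob)
bob.merge(alice)
conflicting = set(alice.heads)        # two heads: the conflict is part of the value
alice.merge_branches("lunch-1:30pm")  # Alice resolves it with a merge commit
bob.merge(alice)
```

Both replicas end up with the same graph and a single head, and the conflict was carried as data until Alice resolved it.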

5. CONSISTENCY SCENARIOS 

Since the major difference of CDVCS compared to commutative
datatypes is the decoupled value-level conflict resolution,
we now want to explore how this can be used to gain
different trade-offs between consistency and availability in
applications.

An important problem in distributed application design
is the changing scalability demands during the life cycle
of an application. For initial prototypes, no coordination
or user-moderated coordination (section 5.3) might be sufficient.
Once the state space and workload increase,
data-moderated (section 5.4) splits into commutative CRDTs and
CDVCSs can render the application both correct and efficient
with explicit semantics for the developer to monitor
and optimize the data synchronization. A relational query
engine can be filled continuously with this mix of datatypes,
decoupling the application-level code from the CRDTs. If
the need for strong consistency arises, only some coordination
mechanism has to be added, while our replication
protocol still takes care of everything else. We pursue this
strategy in our social network demo application topiq^9 with
datascript^10.

^9 https://topiq.es The logic is executed client-side in the
browser; the server only coordinates with pull-hooks.
^10 https://github.com/tonsky/datascript

5.1 Strong consistency

As an example for strong consistency, we consider the
transaction log of a typical ACID relational database as
is modelled in topiq. Such a transaction log cannot be
modeled by traditional CRDTs in a system with distributed
writes, since arbitrary merges of non-commutative operations
break consistency.

5.2 Single writer

In a traditional database like Datomic^11, represented by a
linear transaction log, strong consistency can be modeled by
having a single writer with a single notion of time serializing
the access to the transaction log. CDVCS naturally covers
this application case as a baseline without conflict resolution.

Interesting new choices are possible when different peers
commit to some branch creating different branch heads and
the decoupled conflict resolution comes into play. In these
cases, conflicts can occur, but they might still be resolvable
due to application level constraints or outside knowledge.

^11 http://www.datomic.com/ A commercial scalable
database with a relational query engine.

5.3 User moderated consistency

In our replication system, each user can commit to the
same CDVCS on different peers at the same time, only affecting
her own consistency. The user takes the position of the
central agency providing consistency. Consider as an example
a private address book application. In this case, we can
optimistically commit new entries on all peers (i.e. mobile
phone, tablet, notebook), but in case the user edits
the same entry on an offline and later on an online replica, a
conflict will pop up once the offline replica goes back online.
Automatic resolution is infeasible because the integrity of
the entry without data loss can best be provided by the user.
Since these events are rare, user-driven conflict resolution is
the best choice and can be implemented by the application
appropriately in a completely decentralized fashion.

Figure 4: Committing 100,000 times into a branch
on one replica. This benchmark interacts with the
full peer logic besides network IO.

5.4 Data moderated consistency

Similar to the hotel booking scenario in [7], we can allow users to
book a room optimistically and then have one DVCS in the
system updated strongly consistently on a peer which selectively
pulls and merges in all changes where no overbooking
occurs. It provides a globally consistent state and actively
moves the datatype towards convergence. The advantage
of the CDVCS datatype is that this decision can be made locally
on one peer, independent of the replication, while in
[7] the central peer needs to be known and actively replicated
to. Since the decision happens again in a controlled,
strongly consistent environment, it can be supervised,
and arbitrarily complex decision functions can be executed
atomically. Assume for example that the preferences of a
user in a different CRDT or database allow rebooking rooms
in a comparable hotel nearby. In this scenario, the pulling
operation can decide to apply further transactions on the
database to book rooms in another hotel depending on information
distributed elsewhere instead of just rejecting the
transaction. Furthermore, part of this information could be
privileged and outside of the replication system, making it
impossible in a system of open replication like ours to automatically
merge values on every peer. Conflicts in terms of
CDVCS might in many cases still be resolvable by applying
domain knowledge.
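The selective pull of the hotel booking scenario can be illustrated with a small decision function that runs on the coordinating peer. The sketch below is hypothetical: the function name `pull_without_overbooking`, the tuple layout of a booking and the capacity table are assumptions for this example and are not part of replikativ or its pull-hooks.

```python
def pull_without_overbooking(accepted, incoming, capacity):
    """Selectively pull optimistic bookings into the consistent log.

    accepted: list of (guest, room_type) already in the consistent branch
    incoming: list of optimistic bookings received from other replicas
    capacity: dict mapping room_type to the number of available rooms
    Returns the new consistent log and the list of rejected bookings.
    """
    booked = {}
    for _guest, room_type in accepted:
        booked[room_type] = booked.get(room_type, 0) + 1

    log, rejected = list(accepted), []
    for guest, room_type in incoming:
        if booked.get(room_type, 0) < capacity.get(room_type, 0):
            booked[room_type] = booked.get(room_type, 0) + 1
            log.append((guest, room_type))
        else:
            # A richer decision function could rebook elsewhere instead.
            rejected.append((guest, room_type))
    return log, rejected


log, rejected = pull_without_overbooking(
    accepted=[("alice", "double")],
    incoming=[("bob", "double"), ("carol", "double"), ("dave", "single")],
    capacity={"double": 2, "single": 1},
)
```

Because the function runs on a single peer against a consistent log, it can be made arbitrarily complex without affecting how the optimistic bookings themselves are replicated.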

6. EVALUATION 

We have continuously evaluated replikativ with topiq
on a diverse set of mobile and desktop browsers and found
that the replication behaves robustly despite the occasional
inefficiencies occurring during development. A second
application^12 is the management of data from experiments run
on a scientific simulation cluster with the help of Datomic.
In this case, the datatype is used manually in an interactive
REPL to track experiments including results of large
binary blobs, which is infeasible with git or even a centralized
Datomic alone.

Our work so far has mostly been focused on finding the
proper interfaces and levels of abstraction for replikativ to
behave correctly and reliably and allow straightforward, optimized
extension to new CRDTs. But since performance and
scaling of any distributed system are critical and trade-offs
need to be known, we have conducted some optimizations
and run first benchmarks, as shown in Figure 4. Most
importantly, commit times are held almost constant by applying
Merkle-tree-like partitions of the metadata.

7. CONCLUSION 

Together with our new datatype, replikativ is a promising
platform to provide efficient replication as a service
(RaaS). Importantly, the available mix of datatypes together
with replikativ makes it possible to balance different consistency vs.
availability trade-offs depending on the application semantics
and scale. While we are now able to satisfy our initial
requirements, we are working on extended prototypes
to benchmark and verify our approach together with the
open source community. As an open and global network
of replication, we plan to provide support for application
developers who do not want to care about scaling of their
backend either publicly or in private replication networks.
Already now, the development of the demo applications is
significantly easier than having a dedicated backend, and
feels more like management of local state in native applications
than the typical web development architectures.
Cross-platform data semantics are achievable. Since we explicitly
build on the research around CRDTs, our datatype semantics
are transparent to the developer. Through the implementation
of new and modified CRDTs we will be able to
adapt the replication system to new needs while keeping old
data and applications available.

^12 https://github.com/whilo/cnc